


Repeats in genomic DNA: mining and meaning 

Jerzy Jurka



For hundreds of millions of years, perhaps from the very 
beginning of their evolutionary history, eukaryotic cells have 
been habitats and junkyards for countless generations of 
transposable elements, preserved in repetitive DNA 
sequences. Analysis of these sequences, combined with 
experimental research, reveals a history of complex 
'intracellular ecosystems' of transposable elements that are 
inseparably associated with genomic evolution. 

Addresses 

Genetic Information Research Institute, 1170 Morse Avenue,
Sunnyvale, CA 94089, USA; e-mail: jurka@charon.girinst.org

Current Opinion in Structural Biology 1998, 8:333-337

http://biomednet.com/eIecref/0959440X00800333 

© Current Biology Ltd ISSN 0959-440X

Abbreviations 

L1-EN endonucleolytic domain in L1 reverse transcriptase

LINE long interspersed nuclear element 

LTR long terminal repeat 

MIR mammalian-wide interspersed repeat 

SINE short interspersed nuclear element

TE transposable element 

TSD target site duplication 

Introduction 

Repetitive DNA is a major component of eukaryotic 
genomes. Understanding its origin, evolution, and genetic
impact upon the host DNA is therefore of fundamental 
importance for genome studies. There are two major 
groups of repeats in eukaryotic genomes: tandemly repeat- 
ed satellites, usually confined to specific chromosomal 
regions; and the repeats interspersed with genomic DNA 
that are the major focus of this review. Interspersed
repeats represent mostly inactive copies of a wide variety
of contemporarily and historically active transposable
elements (TEs), such as retroelements and DNA transposons,
which can each be further subdivided into distinct
classes [1]. Repetitive sequences have been recruited as
functional components of eukaryotic genomes, which documents
their contribution to genomic evolution [2-6].
They are also an important source of knowledge about the
biology of active TEs. The emerging picture, bolstered by
recent research, is that TEs are not merely 'parasites'.
Rather, they are integral players in genomic evolution,
showing either a 'selfish' or an 'altruistic' nature, depending
on different evolutionary circumstances.

Reconstruction and analysis of repetitive DNA 

As stated above, interspersed repetitive sequences repre- 
sent inactive (pseudogene) copies of historically or contem- 
porarily active TEs. The study of a new TE usually begins 
with the identification of its repeated copies, followed by 
sequence alignment, classification into subfamilies (if
applicable) and construction of consensus sequences [7].
Apart from the original TEs themselves, consensus 
sequences represent the best available approximations of 
the original active TEs that generated the repeats. Figure 1 
illustrates the relationship between the similarities of individual
repeats to perfect consensus sequences as compared
to similarities between repeats themselves [7]. According to
Figure 1, repeats 37-52% similar to each other will be
55-70% similar to their perfect consensus sequences.
Without such improvement in similarities, the search for 
diverse repeats and other biologically meaningful sequence 
comparisons may be counterproductive. 

Figure 1

[Plot: x axis, average similarity between repeats (0.25 to 1);
y axis, average similarity of repeats to their source gene.]

The similarities between a source gene and its repeats as a function of
the similarities between the repeats. The x variable indicates the
average similarity between repeats sharing a common source gene; y
represents the average similarity of repeats to their source gene that
can be approximated by a consensus sequence. For example, repeats
that are on average 50% similar to each other will be >68% similar to
their ideal consensus sequence. Adapted with permission from [7].
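
The numbers quoted for Figure 1 can be reproduced with a simple model, offered here as an assumption rather than as the derivation used in [7]: each repeat diverges independently from the source sequence, and every substitution is equally likely to produce any of the three other bases. Two repeats then agree at a site either when both retain the source base or when both changed to the same base, so x = y^2 + (1 - y)^2 / 3, which inverts to y = (1 + sqrt(12x - 3)) / 4. The function names below are illustrative.

```python
import math


def pairwise_similarity(y):
    """Expected similarity x between two repeats, given similarity y to the source.

    Assumed model (not taken from the article): independent divergence and
    equiprobable substitutions among the three alternative bases.
    """
    return y * y + (1.0 - y) ** 2 / 3.0


def consensus_similarity(x):
    """Invert the model: similarity y to the source from pairwise similarity x."""
    if not 0.25 <= x <= 1.0:
        raise ValueError("x must lie between 0.25 (random) and 1.0 (identical)")
    return (1.0 + math.sqrt(12.0 * x - 3.0)) / 4.0


if __name__ == "__main__":
    for x in (0.37, 0.50, 0.52):
        print(f"x = {x:.2f} -> y = {consensus_similarity(x):.3f}")
```

Under these assumptions, x values of 0.37 and 0.52 map to y values of about 0.55 and 0.70, and x = 0.50 maps to about 0.68, in line with the figures quoted in the text.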



One can reconstruct ancestral TEs even with limited 
sequence data, especially if individual copies are not very 
diverse. Additional information may be taken into account,
such as the high mutability of CpG dinucleotides or the
presence of open reading frames in which nonsense mutations
can be reversed. This has been dramatically demonstrated
for the Tc1-like DNA transposon from fish, named
Sleeping Beauty, whose transposase was reconstructed from
a dozen inactive copies. Its activity has been demonstrated
not only in the fish from which it originated, but also in
human HeLa cells [8••]. This work, and an earlier study
demonstrating the transfer of a mariner element from
Drosophila to Leishmania [9••], are important steps towards
application of DNA transposons in genomic studies.
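
As a minimal sketch of the reconstruction idea described above, the following Python builds a majority-rule consensus from pre-aligned copies and then restores likely CpG sites. It relies on the well-known decay of CpG to TpG (or CpA on the opposite strand); the function names, the toy alignment and the `min_fraction` threshold are hypothetical choices for illustration, not the procedure used in [7] or [8••].

```python
from collections import Counter


def majority_consensus(aligned):
    """Most common non-gap base per column of equal-length aligned copies."""
    consensus = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != "-")
        consensus.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(consensus)


def restore_cpg(consensus, aligned, min_fraction=0.25):
    """Call CG where the consensus shows a CpG decay product (TG or CA)
    but at least min_fraction of the copies still carry CG.

    min_fraction is a hypothetical threshold chosen for illustration.
    """
    restored = list(consensus)
    for i in range(len(consensus) - 1):
        if consensus[i:i + 2] in ("TG", "CA"):
            cg_copies = sum(1 for copy in aligned if copy[i:i + 2] == "CG")
            if cg_copies / len(aligned) >= min_fraction:
                restored[i:i + 2] = "CG"
    return "".join(restored)


if __name__ == "__main__":
    # Toy alignment: three of five copies have decayed CG to TG at columns 1-2.
    copies = ["ACGTTA", "ATGTTA", "ATGTTA", "ACGTTA", "ATGTTA"]
    draft = majority_consensus(copies)
    print(draft, restore_cpg(draft, copies))
```

In the toy alignment the plain majority rule yields TG at the decayed site, while the CpG-aware pass recovers CG because two of the five copies still carry it.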

Reconstructions of TEs are very labor intensive and
require biological insight, but they often remain unpublished.
In order to promote the dissemination of this information
and to credit the individual effort that goes into
producing it, a new electronic publication entitled
Repbase Update was established [10•]. Repbase Update
represents a systematic attempt to integrate consensus
sequence data, nomenclature, biological classification and
other relevant information into a coherent resource necessary
for sequence studies. To date, over 950 different
repetitive sequence families and subfamilies have been
compiled from all available eukaryotic sequence data (see
Table 1). Of these, over 800 are interspersed repeats. Most
interspersed repeats from vertebrates and plants (~80%)
have been assigned to one of the following major categories:
non-long terminal repeat (LTR) retrotransposons, or
retroposons, also known as SINEs and LINEs; LTR
retrotransposons, including retroviruses; and DNA transposons.
The remaining nonplant, nonvertebrate repeats
come from very diverse species, ranging from protozoans
to octopuses, and are temporarily collected under the arbitrary
name of 'invertebrates'. In this group, the fraction of
interspersed repeats assigned to a particular category is
significantly lower (30-40%), mostly due to insufficient
comparative sequence data necessary for the construction of
reliable consensus sequences. This group of repeats is
expected to hold many 'missing links' in our understanding
of the origin and evolution of TEs.

Human and rodent sequences can be screened against the
most recent version of Repbase Update using public
servers [11,12]. Repeat annotation and masking is recommended
prior to exon identification [13,14], but Repbase
Update is increasingly being used for the direct studies of
repetitive DNA.

Table 1

The current content of Repbase Update.

Type of repeats                     File name     Number of (sub)families
Human repeats                       humrep.ref    284
Alu subfamilies (primate)           humsub.ref    16
Processed pseudogenes (human)       pseudo.ref    20
Rodent repeats                      rodrep.ref    157
Other mammalian repeats             mamrep.ref    96
Other vertebrate repeats            vrtrep.ref    74
Plant repeats                       plnrep.ref    87
Invertebrate repeats                invrep.ref    222
Simple repeats (microsatellites)    simple.ref    131
Total                                             1087
Unique                                            956

Updated human and rodent collections are also available from public
servers for the automatic annotation of DNA sequences [11,12]. Recently
computed proportions of repeats in the nonredundant human sequence
data are as follows: Alu (12.3%); LINE1 (11.9%); MIR (1.6%); LINE2
(2.1%); LTR retrotransposons and endogenous retroviruses (5.6%); DNA
transposons (1.8%); simple repeats (1.4%); other (~0.35%).
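
The masking step recommended before exon identification can be illustrated with a short Python sketch. It assumes that repeat annotation has already produced a list of intervals; the function name, the 0-based half-open coordinate convention and the choice of 'N' as mask character are assumptions of this example and do not describe the output format of any particular server.

```python
def mask_repeats(sequence, repeat_intervals, mask_char="N"):
    """Replace annotated repeat intervals with mask_char.

    Intervals are 0-based, half-open (start, end) pairs; this coordinate
    convention is an assumption of the sketch.
    """
    masked = list(sequence)
    for start, end in repeat_intervals:
        if start < 0 or end > len(sequence) or start > end:
            raise ValueError(f"bad interval: {(start, end)}")
        # Overwrite the interval with the same number of mask characters.
        masked[start:end] = mask_char * (end - start)
    return "".join(masked)


if __name__ == "__main__":
    print(mask_repeats("ACGTACGTAC", [(2, 5)]))
```

Masked positions keep the sequence length unchanged, so coordinates reported by a downstream gene-finding program still refer to the original sequence.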

The genomic fossil record 

The genomic fossil record of past retropositions can be of
great value not only for studies of TEs themselves, but also
for population and phylogenetic studies of their hosts. For
example, young Alu (SINE) subfamilies have been useful
for human population studies. To date, there are five
known Alu subfamilies (Ya1, Ya5, Yb5, Ya8 and Yb8) actively
proliferating in humans [10,15]. Recent innovative studies
of 57 Ya5 Alu sequences, 13 of which are polymorphic
in the human gene pool, led to an estimate of human effective
population size using coalescence theory [16•]. This is
only the latest in a series of human population studies
based on Alu retroposition.

Turning to older short interspersed nuclear element
(SINE) families in mammals, Okada's group [17••]
obtained a phylogenetic resolution of the long disputed
relationship among whales, ruminants, hippopotamuses
and pigs. They have shown that two SINE families, called
CHR-1 and CHR-2, are present exclusively in the
genomes of whales, ruminants and hippopotamuses, which
together form a monophyletic group distinct from that of
pigs and camels. This finding contradicts previous phylogenies
and illustrates the powerful use of the genomic fossil
record in complementing the paleontological record,
which is particularly difficult to obtain for whales.
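
The reasoning behind this kind of inference can be sketched in a few lines of Python. The presence data restate what the paragraph above reports for CHR-1 and CHR-2; the function name and data layout are hypothetical, and the sketch assumes (as such analyses generally do) that a SINE family shared by several lineages reflects common ancestry rather than independent acquisition.

```python
# Lineages carrying each SINE family, as summarised in the text.
SINE_PRESENCE = {
    "CHR-1": {"whales", "ruminants", "hippopotamuses"},
    "CHR-2": {"whales", "ruminants", "hippopotamuses"},
}
ALL_TAXA = {"whales", "ruminants", "hippopotamuses", "pigs", "camels"}


def supported_clades(presence, taxa):
    """Map each distinct carrier set to the families that support it as a clade."""
    clades = {}
    for family, carriers in presence.items():
        # Informative markers are shared by some, but not all, lineages.
        if 1 < len(carriers) < len(taxa):
            clades.setdefault(frozenset(carriers), []).append(family)
    return clades


if __name__ == "__main__":
    for clade, families in supported_clades(SINE_PRESENCE, ALL_TAXA).items():
        print(sorted(clade), "supported by", families)
```

Both families point to the same grouping of whales, ruminants and hippopotamuses, with pigs and camels outside it.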

Another whale-related development was the identification
of homology between the basic units of common satellites
and L1 elements, representing the most abundant LINE
elements in mammals [18•]. Satellites have long been
viewed as a product of unequal crossing over; however,
there is no evidence that they can originate de novo from
nonfunctional 'junk' DNA. The homology between L1
and these satellites is consistent with this view and raises many
interesting questions about satellite and genomic evolution.
Another interesting link between satellites and TEs
is the homology between the centromere-associated protein
(CENP-B) and the pogo family of TEs, although the biological
interpretation of this fact remains tentative [19,20].

Retro(trans)position: a continuation of the
transition from the RNA to the DNA world?

Very little is known about the origin of TEs, but it is conceivable
that the 'TE world' can be traced all the way back
to the beginning of the transition from the hypothetical
RNA-based genome to the DNA-based one. From this
point of view, the entire genomic DNA might have evolved
with close participation of TEs, starting with retroposon-like
elements. Many TEs might have evolved into parasites,
particularly those that can migrate between different hosts, but
some may still retain their original properties as 'genome
builders'. The examples of Drosophila non-LTR retroposons
HeT-A and TART, which maintain telomeres in Drosophila
[21••,22], combined with the recently reported homology
between telomerases and reverse transcriptases [23••,24••],
bring us closer to this broad perspective [25].

In this context, it may be worthwhile to revisit recent
research on the extensively studied mammalian L1
(LINE1) elements. The origin of active mammalian L1
elements remains obscure, but they have produced a succession
of numerous subfamilies during the past 100 million
years or so [26], and they continue to be active at least
in humans and rodents [27•,28]. In spite of their assumed
'selfishness', L1 elements seem to exhibit some remnants
of 'altruistic' features that are compatible with active
participation in genome evolution. They are responsible for
adding over 24% of the DNA to the human genome, only
about half of which is L1 DNA (see legend of Table 1 and
[12]). Unlike other LINE elements that are parasitized by
SINEs homologous to their 3' ends [29], L1s apparently
retropose a large variety of SINE elements and mRNAs
([30••], see below) that have no obvious structural relationship
to their own RNA, with the possible exception of
poly(A) tails [31]. This is consistent with a recent study
demonstrating the ability of L1 reverse transcriptase to
efficiently generate cDNA from RNA with no sequence
specificity, including transcripts from cellular genes [32•].
Even the affinity of L1 reverse transcriptase for polyadenylated
RNA in the vicinity of the ribosomal system [31] may
be interpreted as a remnant of the original participation of
L1 predecessors in the retroposition of protein-encoding
RNA. Another relevant property may be the ability of L1
reverse transcriptase to heal chromosomal breaks, although
there is some debate as to whether this can instead be attributed
to nonhomologous recombination events [33,34].
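
The 'over 24%' figure and the 'about half' share can be checked against the proportions given in the legend of Table 1. The short Python calculation below assumes that the L1-mediated contribution is approximated by the sum of the Alu and LINE1 fractions; that reading, and the names used in the code, are assumptions made for illustration.

```python
# Percentages of the nonredundant human sequence data (legend of Table 1).
FRACTIONS = {
    "Alu": 12.3,
    "LINE1": 11.9,
    "MIR": 1.6,
    "LINE2": 2.1,
    "LTR": 5.6,
    "DNA transposons": 1.8,
    "simple repeats": 1.4,
}


def l1_mediated_share(fractions, families=("Alu", "LINE1")):
    """Return (combined percentage, fraction of it that is LINE1 itself).

    Treating Alu plus LINE1 as the L1-mediated total is an assumption.
    """
    total = sum(fractions[name] for name in families)
    return total, fractions["LINE1"] / total


if __name__ == "__main__":
    total, l1_part = l1_mediated_share(FRACTIONS)
    print(f"L1-mediated DNA: {total:.1f}% of the genome; L1 itself: {l1_part:.0%}")
```

With these inputs the combined fraction is 24.2% and L1 itself accounts for roughly 49% of it.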

Diversity and co-evolution of TEs 

The genomic fossil record deposited in eukaryotic
genomes shows that autonomous TEs tend to be accompanied
by nonautonomous companions that are unable to
proliferate themselves. Examples include transposon deletion
fragments [35,36], SINE elements homologous to 3'
ends of LINE elements [29], and defective LTR
retrotransposons, including defective endogenous retroviruses.
To multiply, the first group must be able to use transposase
from intact DNA transposons; SINE proliferation depends
on LINE-encoded reverse transcriptase; and the remaining
retroelements probably rely on intact viruses for their
reproduction. There may be a delicate balance between
the autonomous and nonautonomous groups of TEs,
analogous to the balance between species in complex
ecosystems. Autonomous elements proliferating out of control
may destroy their hosts. Nonautonomous elements may
destroy themselves by 'successful' competition for the
reverse transcriptase or transposase produced by the
autonomous TEs. Transposase titration by defective transposons
has been discussed among possible factors for the
restriction of the activity of mariner-like transposable elements
in natural populations [36], although more specialized
mechanisms, such as overproduction inhibition and
missense mutation effects, are viewed as more prominent
events in limiting proliferation of DNA transposons.
Multiple LINE1 and SINE (Alu, B1, B2, BC1, etc.) subfamilies
in mammals may be viewed as examples of the
ongoing co-evolution that is driven by competition for
reverse transcriptase [26,30••,37]. LINE2 and mammalian-wide
interspersed repeat (MIR) elements [12] might have
become extinct as a result of similar competition. Among
general mechanisms for the restriction of TEs on the
genomic side, suppression by CpG methylation and
heterochromatinization have recently been discussed [4,38,39].
Overall, our knowledge of the mechanisms controlling
TEs at the genomic level is still fragmentary [40].

Co-evolution between autonomous and nonautonomous
elements may not be sufficient to account for the diversity
of endogenous retroviruses and retroviral-like elements in
mammals. Almost half of all the human repetitive elements
deposited in Repbase Update [10•] are either diverse LTRs
or fragments of viruses and LTR retrotransposons, although
they represent less than 6% of the human genome (see legend
of Table 1). In this context, it is worth mentioning a
renewed interest in co-evolution between endogenous and
exogenous retroviruses that could benefit the host [41,42].
Other related possibilities include recurrent infections and
recombinations between distantly related viruses (V
Kapitonov and J Jurka, unpublished data).

Targeting the mammalian genome 

Sequence analysis of target site duplications (TSDs) of retroposed
elements from mammals [30••], combined with the
independent discovery of the endonucleolytic domain in L1
reverse transcriptase (L1-EN, reviewed in [31]), brought
about a recent breakthrough in our understanding of retroposon
integration in mammals. The consensus sequence of
TSDs and adjacent regions for L1, Alu, ID(BC1), B1, B2,
and processed pseudogenes is TT|AAAA(N)nTYTN|R,
where R denotes purines, Y represents pyrimidines and N is
any base. The vertical bars show predicted positions of
breakpoints on the opposite strands of double-stranded
DNA [30••,37]. TT|AAAA resembles the consensus sequence
nicked by the L1-EN [43••], an additional argument implicating
L1 reverse transcriptase in the retroposition of nonautonomous
retroposons. The general consensus sequence of
the TSDs may combine different subclasses of targets. For
example, targets beginning with TT|AGAA are longer on
average than the targets beginning with TT|AAAA (J Jurka,
unpublished data). Different target preferences may be related
to different active L1s [27•].
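
A consensus of this kind translates naturally into a pattern search. The Python sketch below assumes the consensus reads TT|AAAA, followed by a variable spacer, followed by TYTN|R; the `max_spacer` parameter is a hypothetical bound introduced only to keep the search finite, and the function names are illustrative. Reported indices are 0-based positions of the predicted breakpoints on the strand shown.

```python
import re

# First breakpoint: TT|AAAA. The lookahead lets overlapping sites be reported.
FIRST_SITE = re.compile(r"(?=TTAAAA)")


def first_nick_positions(sequence):
    """Indices at which the predicted nick falls (between TT and AAAA)."""
    return [m.start() + 2 for m in FIRST_SITE.finditer(sequence.upper())]


def candidate_targets(sequence, max_spacer=20):
    """Return (nick1, nick2) index pairs for TT|AAAA ... TYTN|R candidates.

    max_spacer is a hypothetical limit on the number of N positions between
    the two conserved blocks; the shortest spacer is chosen at each start.
    """
    pattern = re.compile(
        r"(?=(TTAAAA[ACGT]{0,%d}?T[CT]T[ACGT][AG]))" % max_spacer
    )
    hits = []
    for m in pattern.finditer(sequence.upper()):
        start, length = m.start(), len(m.group(1))
        # nick1 follows TT; nick2 precedes the final purine (R).
        hits.append((start + 2, start + length - 1))
    return hits


if __name__ == "__main__":
    seq = "CCTTAAAAGGGTCTAGCC"
    for nick1, nick2 in candidate_targets(seq):
        print(nick1, nick2, seq[nick1:nick2])
```

The segment between the two reported positions corresponds to the stretch that would be duplicated on integration if both predicted nicks were used.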

The conserved sequences around both breakpoints in the
consensus sequence given above appear to be different from
each other, but separate analyses indicate that both
sequences are enriched with kinkable TA, CA and TG dinucleotide
steps, which suggests a similar mechanism by
which both breaks are generated [44•]. This mechanism may
be of general significance since the kinkable dinucleotides
are conserved in targets both for DNA transposons and for
insertion elements in bacteria [44•].
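
Enrichment of this kind can be quantified by counting dinucleotide steps. The following Python sketch computes the fraction of steps in a sequence that are TA, CA or TG; the function name is illustrative, and the sketch does not reproduce the statistical analysis of [44•].

```python
KINKABLE_STEPS = {"TA", "CA", "TG"}


def kinkable_fraction(sequence):
    """Fraction of overlapping dinucleotide steps that are TA, CA or TG."""
    seq = sequence.upper()
    steps = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not steps:
        return 0.0
    return sum(step in KINKABLE_STEPS for step in steps) / len(steps)


if __name__ == "__main__":
    print(kinkable_fraction("TACATG"))
```

Comparing this fraction for sequences flanking integration sites with the fraction in a suitable background set would indicate whether the steps are over-represented.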






In analogy to the model of integration of the insect R2 non-LTR
retroposon [45], the reverse transcription of mammalian
retroposons may be primed by the 3' DNA ends
exposed by nicking. Although self-priming of retroposable
RNA has been recently demonstrated in vitro [46], its role
in the retroposition of mammalian retroposons may be
marginal, if any.

It has long been known that double-stranded breaks stimulate
homologous recombination. Therefore, DNA targets
exposed to L1-EN nicking activity may be recombinational
hot spots in mammalian genomes. This may have implications
for the understanding of at least some of the fragile
chromosomal sites involved in the origin of genetic diseases.

Conclusions 

The reverse flow of information from RNA to DNA might
have had a definite beginning in the history of life, but it has
never ended. It remains an integral part of the ongoing
genomic evolution in eukaryotic species. It is manifested in
active retroposons and in their fossil record as interspersed
repetitive DNA. These are the major conclusions emerging
from recent progress in the field. Based on these conclusions,
the one-dimensional interpretation of TEs as 'parasites' or
'selfish' elements should be transformed into a more balanced
view, with their diverse roles comparable to the biological
roles of individual species in evolving ecosystems. As
the diverse world of TEs continues to emerge with new
sequence data, TEs are increasingly being explored in a
broad range of biological problems, from phylogenetic and
population studies to genomic engineering.

Acknowledgements 

Many outstanding and relevant earlier contributions could not be
reviewed here. I selected a number of broad recent reviews to compensate
for this deficiency. I would like to thank Vladimir Kapitonov, Paul
Klonowski, Dorothy Munro and Jolanta Walichiewicz for help with editing
this manuscript. This work was supported by the National Institutes of
Health grant 1 P41 LM06252.

References and recommended reading 

Papers of particular interest, published within the annual period of review,
have been highlighted as:

• of special interest
•• of outstanding interest

1. Capy P: Classification of transposable elements. In Molecular
Biology Intelligence Unit: Dynamics and Evolution of Transposable
Elements. Edited by Capy P, Bazin C, Higuet D, Langin T.
Georgetown, Texas: Landes Bioscience; 1998:37-52.

2. Brosius J, Tiedge H: Reverse transcriptase: mediator of genomic
plasticity. Virus Genes 1996, 11:163-179.

3. Levin HL: It's prime time for reverse transcriptase. Cell 1997, 88:5-8.

4. Kidwell MG, Lisch D: Transposable elements as sources of
variation in animals and plants. Proc Natl Acad Sci USA 1997,
94:7704-7711.

5. Tomilin NV: Control of genes by mammalian retroposons. Int Rev
Cytol 1998, in press.

6. Chu WM, Ballard R, Carpick BW, Williams BR, Schmid CW:
Potential Alu function: regulation of the activity of double-stranded
RNA-activated kinase PKR. Mol Cell Biol 1998, 18:58-68.

7. Jurka J: Approaches to identification and analysis of interspersed
repetitive DNA sequences. In Automated DNA Sequencing and
Analysis. Edited by Adams MD, Fields C, Venter JC. San Diego:
Academic Press Incorporated; 1994:294-298.

8. •• Ivics Z, Hackett PB, Plasterk RH, Izsvak Z: Molecular reconstruction
of Sleeping Beauty, a Tc1-like transposon from fish, and its
transposition in human cells. Cell 1997, 91:501-510.
This important work is about the reconstruction of an active transposase from
12 pseudogenes found in eight different fish species and using a modified
consensus sequence. The approach used has implications for the reconstruction
of other proteins involved in proliferation of transposable elements,
for the engineering of new transposable elements, and for genome studies.

9. •• Gueiros-Filho FJ, Beverley SM: Trans-kingdom transposition of the
Drosophila element mariner within the protozoan Leishmania.
Science 1997, 276:1716-1719.
The authors demonstrate the efficient transfer of the Drosophila mauritiana
mariner element into the human parasite Leishmania major. This, and recent
experiments with a reconstructed transposase [8••], clearly demonstrate the
feasibility of genetic studies on a wide variety of species using DNA
transposable elements.

10. • Repbase Update 1997 on World Wide Web URL:
http://www.girinst.org/~server/repbase.html
This is a collective attempt to organize the explosively growing number and
variety of repetitive sequences. Repbase Update includes many consensus
sequences of transposable elements and their biological characterization
that are unreported anywhere else.

11. Genetic Information Research Institute on the World Wide Web URL:
http://charon.girinst.org

12. Smit AFA: The origin of interspersed repeats in the human
genome. Curr Opin Genet Dev 1996, 6:743-748.

13. Burge C, Karlin S: Prediction of complete gene structures in
human genomic DNA. J Mol Biol 1997, 268:78-94.

14. Claverie JM: Computational methods for the identification of
genes in vertebrate genomic sequences. Hum Mol Genet 1997,
6:1735-1744.

15. Mighell AJ, Markham AF, Robinson PA: Alu sequences. FEBS Lett
1997, 417:1-5.

16. • Sherry ST, Harpending HC, Batzer MA, Stoneking M: Alu evolution in
human populations: using the coalescent to estimate effective
population size. Genetics 1997, 147:1977-1982.
This paper demonstrates a very interesting application of Alu polymorphism for
estimating human effective population size during the last 1-2 million years.

17. •• Shimamura M, Yasue H, Ohshima K, Abe H, Kato H, Kishiro T, Goto
M, Munechika I, Okada N: Molecular evidence from retroposons
that whales form a clade within even-toed ungulates. Nature 1997,
388:666-670.
This paper addresses an important phylogenetic problem by innovative
exploitation of selected repetitive sequences. This is a powerful example of
how the genomic fossil record for some species can be more informative
than the paleontological record.

18. • Kapitonov V, Holmquist G, Jurka J: L1 repeat is a basic unit of
heterochromatin satellites in cetaceans. Mol Biol Evol 1998,
15:611-612.
This work has important implications for the understanding of the origin and
evolution of satellite DNA.

19. Halverson D, Baum M, Stryker J, Carbon J, Clarke L: A centromere
DNA-binding protein from fission yeast affects chromosome
segregation and has homology to human CENP-B. J Cell Biol
1997, 136:487-500.

20. Kipling D, Warburton PE: Centromeres, CENP-B and Tigger too.
Trends Genet 1997, 13:141-145.

21. •• Danilevskaya ON, Arkhipova IR, Traverse KL, Pardue ML: Promoting
in tandem: the promoter for telomere transposon HeT-A and
implications for the evolution of retroviral LTRs. Cell 1997,
88:647-655.
This work shows that promoter activity in the retroposon HeT-A is located at
its 3' end, in contrast to other retroposons. Tandemly arranged HeT-A
elements share these 3' promoters with their downstream neighbors. The
authors conclude that, because of its unusual structure, HeT-A resembles
an evolutionary intermediate between non-LTR and LTR retrotransposons.

22. Pardue ML, Danilevskaya ON, Traverse KL, Lowenhaupt K:
Evolutionary links between telomeres and transposable elements.
Genetica 1997, 100:73-84.

23. •• Lingner J, Hughes TR, Shevchenko A, Mann M, Lundblad V, Cech TR:
Reverse transcriptase motifs in the catalytic subunit of
telomerase. Science 1997, 276:561-567.
Telomerase catalytic subunits were first identified in Euplotes aediculatus
and Saccharomyces cerevisiae, and were shown to contain reverse
transcriptase motifs. This paper further demonstrates the fact that the
reverse transcriptase motif is essential for normal chromosome telomere
replication. This work brings together retroposition and chromosome
maintenance and has profound evolutionary implications.

24. •• Nakamura TM, Morin GB, Chapman KB, Weinrich SL, Andrews WH,
Lingner J, Harley CB, Cech TR: Telomerase catalytic subunit
homologs from fission yeast and human. Science 1997,
277:955-959.
This paper reveals that the catalytic subunits of telomerases [23••] have
conserved domains common to all reverse transcriptases. These domains
also revealed distinct hallmarks and the authors conclude that they
represent a deep branch in the evolution of reverse transcriptases, and
perhaps originated with the first eukaryote.

25. Eickbush TH: Telomerase and retrotransposons: which came first?
Science 1997, 277:911-912.

26. Smit AFA, Toth G, Riggs AD, Jurka J: Ancestral mammalian-wide
subfamilies of LINE-1 repetitive sequences. J Mol Biol 1995,
246:401-417.

27. • Sassaman DM, Dombroski BA, Moran JV, Kimberland ML, Naas TP,
DeBerardinis RJ, Gabriel A, Swergold GD, Kazazian HH Jr: Many
human L1 elements are capable of retrotransposition. Nat Genet
1997, 16:37-43.
This paper estimates the number of active L1 copies in the human genome.
Different L1s may account for the presence of different targets for
retroposon integration, as discussed in the review.

28. Naas TP, DeBerardinis RJ, Moran JV, Ostertag EM, Kingsmore SF,
Seldin MF, Hayashizaki Y, Martin SL, Kazazian HH Jr: An actively
retrotransposing, novel subfamily of mouse L1 elements. EMBO J
1998, 17:590-597.

29. Okada N, Hamada M, Ogiwara I, Ohshima K: SINEs and LINEs
share common 3' sequences: a review. Gene 1997, 205:229-243.

30. •• Jurka J: Sequence patterns indicate an enzymatic involvement in
integration of mammalian retroposons. Proc Natl Acad Sci USA
1997, 94:1872-1877.
This paper shows for the first time that the integration of SINE, L1 and
processed retropseudogenes occurs at nonrandom, consensus-defined
sequence targets. This strongly links the L1 retroposition machinery to the
proliferation of non-LINE retroposons and has implications for
understanding of the mechanism of retroposition.

31. Boeke JD: LINEs and Alus - the polyA connection. Nat Genet
1997, 16:6-7.

32. • Dhellin O, Maestre J, Heidmann T: Functional differences between
the human LINE retrotransposon and retroviral reverse
transcriptases for in vivo mRNA reverse transcription. EMBO J
1997, 16:6590-6602.
This paper demonstrates the specific and high efficiency of L1 reverse
transcription of RNA that has no sequence specificity. This is compatible
with 'unselfish' aspects of L1 previously discussed in this review.



33. Teng SC, Kim B, Gabriel A: Retrotransposon reverse-transcriptase-mediated
repair of chromosomal breaks. Nature 1996, 383:641-644.

34. Lauermann V: DNA repair by recycling reverse transcripts. Nature
1997, 386:31-32.

35. Vos JC, De Baere I, Plasterk RHA: Transposase is the only
nematode protein required for in vitro transposition of Tc1. Genes
Dev 1996, 10:755-761.

36. Hartl DL, Lozovskaya ER, Nurminsky DI, Lohe AR: What restricts the
activity of mariner-like transposable elements? Trends Genet
1997, 13:197-201.

37. Jurka J, Klonowski P: Integration of retroposable elements in
mammals: selection of target sites. J Mol Evol 1996, 43:685-689.

38. Yoder JA, Walsh CP, Bestor TH: Cytosine methylation and the
ecology of intragenomic parasites. Trends Genet 1997, 13:335-340.

39. Bird A: Does DNA methylation control transposition of selfish
elements in the germline? Trends Genet 1997, 13:469-470.

40. Labrador M, Corces VG: Transposable element-host interactions:
regulation of insertion and excision. Annu Rev Genet 1997,
31:381-404.

41. Van der Kuyl AC: Endogenous retrovirus sequences and their
usefulness to the host. Trends Microbiol 1997, 5:339.

42. Best S, Le Tissier PR, Stoye JP: Endogenous retroviruses and the
evolution of resistance to retroviral infection. Trends Microbiol
1997, 5:313-318.

43. •• Feng Q, Moran JV, Kazazian HH Jr, Boeke JD: Human L1
retrotransposon encodes a conserved endonuclease required for
retrotransposition. Cell 1996, 87:905-916.
This breakthrough paper demonstrates the presence of an endonucleolytic
domain in L1-encoded reverse transcriptase, implying that reverse
transcription in mammals is primed by the 3' DNA ends that are exposed by
nicking, as previously established in insects [45].

44. • Jurka J, Klonowski P, Trifonov EN: Mammalian retroposons integrate
at kinkable DNA sites. J Biomol Struct Dyn 1998, 15:717-721.
Sequence data indicate that the integration of retroposons and other TEs
may be associated with the formation of DNA kinks. This suggests the
presence of universal structural features associated with the integration
of TEs.

45. Luan DD, Korman MH, Jakubczak JL, Eickbush TH: Reverse
transcription of R2Bm RNA is primed by a nick at the
chromosomal target site: a mechanism for non-LTR
retrotransposition. Cell 1993, 72:595-605.

46. Shen MR, Brosius J, Deininger PL: BC1 RNA, the transcript from a
master gene for ID element amplification, is able to prime its own
reverse transcription. Nucleic Acids Res 1997, 25:1641-1648.



